Feature space maximum a posteriori linear regression for adaptation of deep neural networks

Authors

Zhen Huang

Jinyu Li

Sabato Marco Siniscalchi

I-Fan Chen

Chao Weng

Chin-Hui Lee

Abstract

We propose a feature space maximum a posteriori (MAP) linear regression framework to adapt the parameters of context-dependent deep neural network hidden Markov models (CD-DNN-HMMs). Because DNN acoustic models for large vocabulary continuous speech recognition use a very large number of parameters, over-fitting can be severe in DNN adaptation and often impairs the robustness of the adapted model. The linear input network (LIN), a straightforward feature space adaptation method for DNNs that is similar to feature space maximum likelihood linear regression (fMLLR), can suffer from the same robustness problem. The proposed framework applies MAP estimation to the LIN parameters, incorporating prior knowledge into the adaptation process. Experimental results on the Switchboard task show that, relative to the speaker-independent CD-DNN-HMM system, LIN provides a 4.28% relative word error rate reduction (WERR), and the proposed fMAPLIN method provides a further 1.15% WERR on top of LIN, for a total of 5.43%.
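The idea of MAP estimation for LIN parameters can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the form of the prior, so the code assumes an isotropic Gaussian prior centered on the identity transform, which turns the MAP objective into cross-entropy plus a quadratic penalty weighted by a hypothetical `lam`. A frozen softmax layer `V` stands in for the speaker-independent DNN, and the data are synthetic.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(x, y, W, b, V):
    """Mean cross-entropy of the frozen model applied to transformed features."""
    p = softmax((x @ W.T + b) @ V.T)
    return -np.mean(np.log(p[np.arange(len(y)), y]))

def adapt_lin(x, y, V, lam, lr=0.05, steps=400):
    """Estimate a linear input transform (W, b) in front of a frozen model V.

    lam = 0 gives plain cross-entropy LIN; lam > 0 adds the penalty
    (lam / 2) * (||W - I||^2 + ||b||^2), i.e. MAP estimation under the
    assumed Gaussian prior centered on the identity transform.
    """
    d = x.shape[1]
    W0, b0 = np.eye(d), np.zeros(d)      # assumed prior mean
    W, b = W0.copy(), b0.copy()
    onehot = np.eye(V.shape[0])[y]
    for _ in range(steps):
        p = softmax((x @ W.T + b) @ V.T)
        g_h = (p - onehot) @ V / len(y)  # gradient w.r.t. transformed features
        W -= lr * (g_h.T @ x + lam * (W - W0))
        b -= lr * (g_h.sum(axis=0) + lam * (b - b0))
    return W, b

def distance_from_prior(W, b):
    return np.sqrt(np.sum((W - np.eye(len(b))) ** 2) + np.sum(b ** 2))

# Synthetic stand-in for a speaker-independent model and one speaker's data.
rng = np.random.default_rng(0)
V = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])  # frozen output layer
y = rng.integers(0, 3, size=30)                        # adaptation labels
clean = 2.0 * V[y] + 0.3 * rng.standard_normal((30, 2))
x = 0.5 * clean + 1.0                                  # speaker-specific distortion

W_ml, b_ml = adapt_lin(x, y, V, lam=0.0)    # unregularized LIN
W_map, b_map = adapt_lin(x, y, V, lam=5.0)  # LIN with the assumed prior

ce_prior = cross_entropy(x, y, np.eye(2), np.zeros(2), V)
ce_ml = cross_entropy(x, y, W_ml, b_ml, V)
ce_map = cross_entropy(x, y, W_map, b_map, V)
dist_ml = distance_from_prior(W_ml, b_ml)
dist_map = distance_from_prior(W_map, b_map)
```

In this sketch both variants lower the cross-entropy on the adaptation data, but the penalized estimate stays closer to the prior transform. That is the mechanism by which a prior can limit over-fitting when adaptation data are scarce.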

Similar resources

Maximum a posteriori adaptation of network parameters in deep models

We present a Bayesian approach to adapting parameters of well-trained context-dependent deep-neural-network hidden Markov models (CD-DNN-HMMs) to improve automatic speech recognition performance. Due to an abundance of DNN parameters but with only a limited amount of adaptation data, the posterior probabilities of unseen CD states (senones) are often pushed towards zero during adaptation, and...

Semi-Supervised Speaker Adaptation for In-Vehicle Speech Recognition with Deep Neural Networks

In this paper, we present a new i-vector based speaker adaptation method for automatic speech recognition with deep neural networks, focusing on in-vehicle scenarios. Rather than augmenting i-vectors to acoustic feature vectors to form concatenated input vectors for adapting neural network acoustic model parameters, our proposed method performs feature-space transformation with smaller ...

GMM-derived features for effective unsupervised adaptation of deep neural network acoustic models

In this paper we investigate GMM-derived features recently introduced for adaptation of context-dependent deep neural network HMM (CD-DNN-HMM) acoustic models. We improve the previously proposed adaptation algorithm by applying the concept of speaker adaptive training (SAT) to DNNs built on GMM-derived features and by using fMLLR-adapted features for training an auxiliary GMM model. Traditional...

Robust Feature Space Adaptation for T

Speaker adaptation is critical for modern speech recognition systems. Due to the computational and multi-channel model sharing considerations, the use of model adaptation techniques is limited in telephony speech recognition systems. On the other hand, feature space adaptation methods such as feature space maximum likelihood linear regression (fMLLR) are efficient approaches suitable for teleph...

Robust feature space adaptation for telephony speech recognition

Publication date: 2014
